A Statistical Mystery
New Study Shows: Less Sleep = Higher Test Scores!
Study Design:
Warning: This is simulated data for teaching purposes only.
The shocking result…
In the pooled data, each extra hour of sleep is associated with a 9.8-point drop in scores!
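A pooled result like this is easy to reproduce in a small sketch. The data below are a hypothetical simulation, not the dataset behind these slides: the six students, their average sleep, their baselines and their slopes are all assumptions chosen for illustration, so the 9.8 figure will not be reproduced.

```r
# Hypothetical simulated data (not the slides' dataset): 6 students, 10 nights each
set.seed(1)
n_nights <- 10
d <- data.frame(
  student    = factor(rep(LETTERS[1:6], each = n_nights)),
  sleep_mean = rep(9:4, each = n_nights)        # A sleeps most, F sleeps least
)
d$sleep_dev <- rep(seq(-1, 1, length.out = n_nights), times = 6)
d$sleep     <- d$sleep_mean + d$sleep_dev

baseline <- rep(seq(40, 90, by = 10), each = n_nights)      # students who sleep less start higher
slope    <- rep(seq(2, 5, length.out = 6), each = n_nights)  # every student benefits from sleep
d$score  <- baseline + slope * d$sleep_dev + rnorm(nrow(d), sd = 0.5)

# Pooled regression that ignores which student each row belongs to
pooled <- lm(score ~ sleep, data = d)
coef(pooled)["sleep"]  # negative in this simulation: more sleep looks harmful
```

Every simulated student has a positive slope, yet the pooled coefficient comes out negative because the fit is dominated by differences between students.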
Let’s look at each student individually to check if this makes sense.
Maybe there’s something we’re missing?
Student A: More sleep = Better scores ✓
Student B: More sleep = Better scores ✓
Student C: More sleep = Better scores ✓
Student D: More sleep = Better scores ✓
Student E: More sleep = Better scores ✓
Student F: More sleep = Better scores ✓
Every single student shows: More sleep = Better performance
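The student-by-student check can be sketched by fitting one regression per student. The simulated data here are hypothetical (assumed baselines and slopes, not the slides' dataset); only the pattern matters.

```r
# Hypothetical simulated data (not the slides' dataset): 6 students, 10 nights each
set.seed(1)
n_nights <- 10
d <- data.frame(
  student    = factor(rep(LETTERS[1:6], each = n_nights)),
  sleep_mean = rep(9:4, each = n_nights)
)
d$sleep_dev <- rep(seq(-1, 1, length.out = n_nights), times = 6)
d$sleep     <- d$sleep_mean + d$sleep_dev
baseline <- rep(seq(40, 90, by = 10), each = n_nights)
slope    <- rep(seq(2, 5, length.out = 6), each = n_nights)
d$score  <- baseline + slope * d$sleep_dev + rnorm(nrow(d), sd = 0.5)

# One separate regression per student; keep only the sleep coefficient
slopes <- sapply(split(d, d$student), function(s) {
  coef(lm(score ~ sleep, data = s))[["sleep"]]
})
round(slopes, 2)  # one slope per student, all positive in this simulation
```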
But the aggregate data showed: Less sleep = Better performance!
How is this possible?
Let’s look at that first plot again…
Question: How can the overall trend be negative when every individual trend is positive?
Notice anything?
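The Mystery: Students with different baseline scores have different sleep patterns! Here, students with higher baselines tend to sleep less, so comparing across students reverses the trend seen within each student.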
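One way to see this is to collapse each student to a single point (average sleep, average score) and regress across those points. This is a sketch on hypothetical simulated data, with assumed baselines and sleep averages rather than the slides' values.

```r
# Hypothetical simulated data (not the slides' dataset): 6 students, 10 nights each
set.seed(1)
n_nights <- 10
d <- data.frame(
  student    = factor(rep(LETTERS[1:6], each = n_nights)),
  sleep_mean = rep(9:4, each = n_nights)
)
d$sleep_dev <- rep(seq(-1, 1, length.out = n_nights), times = 6)
d$sleep     <- d$sleep_mean + d$sleep_dev
baseline <- rep(seq(40, 90, by = 10), each = n_nights)
slope    <- rep(seq(2, 5, length.out = 6), each = n_nights)
d$score  <- baseline + slope * d$sleep_dev + rnorm(nrow(d), sd = 0.5)

# Collapse to one row per student: average sleep and average score
means <- aggregate(cbind(sleep, score) ~ student, data = d, FUN = mean)

# Regression across students only (between-student trend)
between_slope <- coef(lm(score ~ sleep, data = means))[["sleep"]]
between_slope  # negative here: the high scorers are the short sleepers
```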
When we have repeated measures, students can differ in two fundamental ways: in their baseline performance (intercepts) and in how strongly they respond to sleep (slopes).
Let’s build up the model step by step to see how we capture these differences…
Random Intercept Model: lmer(score ~ sleep + (1 | student))
This model assumes students differ in baseline performance, but respond the same way to sleep.
Each student starts at a different level, but the model assumes sleep affects everyone equally.
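Written out, the random intercept model looks roughly like this. The notation is an assumption introduced here for illustration: i indexes measurements, j indexes students, u_{0j} is student j's baseline deviation, and tau_0 and sigma are standard deviations.

```latex
\text{score}_{ij} = (\beta_0 + u_{0j}) + \beta_1\,\text{sleep}_{ij} + \varepsilon_{ij},
\qquad u_{0j} \sim \mathcal{N}(0, \tau_0^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Only the intercept carries a student-specific term, so all fitted lines are parallel.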
Random Slope Model: lmer(score ~ sleep + (0 + sleep | student))
This model assumes students start at the same baseline, but respond differently to sleep.
Some students are more “sleep-sensitive” (steeper slopes) than others
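In equation form, the random slope model moves the student-specific term from the intercept to the sleep coefficient. As before, the symbols are assumed notation: u_{1j} is student j's deviation from the average slope beta_1.

```latex
\text{score}_{ij} = \beta_0 + (\beta_1 + u_{1j})\,\text{sleep}_{ij} + \varepsilon_{ij},
\qquad u_{1j} \sim \mathcal{N}(0, \tau_1^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

All fitted lines share the same intercept at sleep = 0 and fan out from there.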
Random Slopes & Intercepts Model: lmer(score ~ sleep + (sleep | student))
This model allows students to differ in both baseline AND their response to sleep.
This captures both sources of variation we see in the data
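The full model combines both deviations. With the (sleep | student) syntax, lmer also estimates the correlation between a student's intercept and slope, written rho below. The symbols are again assumed notation.

```latex
\text{score}_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\,\text{sleep}_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_0^2 & \rho\,\tau_0\tau_1 \\ \rho\,\tau_0\tau_1 & \tau_1^2 \end{pmatrix}
\right),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```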
The full model (sleep | student) seems best, so why use simpler models?
Note
Best practice: Start complex, simplify if needed. Use model comparison (AIC, BIC, likelihood ratio tests) to guide decisions.
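A comparison along these lines might look like the sketch below. It assumes the lme4 package is installed and uses hypothetical simulated data (not the slides' dataset). With only six students, lmer may print singular-fit or convergence warnings for the full model.

```r
library(lme4)  # assumes lme4 is installed

# Hypothetical simulated data (not the slides' dataset): 6 students, 10 nights each
set.seed(1)
n_nights <- 10
d <- data.frame(
  student    = factor(rep(LETTERS[1:6], each = n_nights)),
  sleep_mean = rep(9:4, each = n_nights)
)
d$sleep_dev <- rep(seq(-1, 1, length.out = n_nights), times = 6)
d$sleep     <- d$sleep_mean + d$sleep_dev
baseline <- rep(seq(40, 90, by = 10), each = n_nights)
slope    <- rep(seq(2, 5, length.out = 6), each = n_nights)
d$score  <- baseline + slope * d$sleep_dev + rnorm(nrow(d), sd = 0.5)

# Simpler model: different baselines, common slope
m_int  <- lmer(score ~ sleep + (1 | student), data = d)
# Full model: different baselines and different slopes
m_full <- lmer(score ~ sleep + (sleep | student), data = d)

# anova() refits both models with ML, then reports AIC, BIC
# and a likelihood ratio test between the nested models
cmp <- anova(m_int, m_full)
print(cmp)
```

Lower AIC or BIC favors a model. The likelihood ratio test for a variance component tends to be conservative because the null value sits on the boundary of the parameter space, so a borderline p-value deserves some caution.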
Our data shows BOTH sources of variation:
Random intercepts (baselines) + Random slopes (sleep sensitivity) = Full model
“Does sleep affect test performance?” is actually THREE different questions!
Each question requires a different model and gives a different answer!
“Do students who sleep more score better?”
Answer: NO → This is confounded by individual differences!
“When each student sleeps more, do they score better?”
Answer: YES → Every student benefits! But effects vary (some steep, some flat)
The dashed line represents the average effect WITHIN students
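The two questions can also be answered side by side by splitting sleep into a between-student part (each student's average) and a within-student part (deviation from that average). This is a swapped-in illustration using plain lm on hypothetical simulated data. It is not the lmer fit behind the dashed line, and the variable names sleep_between and sleep_within are made up for this sketch.

```r
# Hypothetical simulated data (not the slides' dataset): 6 students, 10 nights each
set.seed(1)
n_nights <- 10
d <- data.frame(
  student    = factor(rep(LETTERS[1:6], each = n_nights)),
  sleep_mean = rep(9:4, each = n_nights)
)
d$sleep_dev <- rep(seq(-1, 1, length.out = n_nights), times = 6)
d$sleep     <- d$sleep_mean + d$sleep_dev
baseline <- rep(seq(40, 90, by = 10), each = n_nights)
slope    <- rep(seq(2, 5, length.out = 6), each = n_nights)
d$score  <- baseline + slope * d$sleep_dev + rnorm(nrow(d), sd = 0.5)

# Split sleep into each student's average and the night-to-night deviation
d$sleep_between <- ave(d$sleep, d$student)      # per-student mean sleep
d$sleep_within  <- d$sleep - d$sleep_between    # deviation from own mean

fit <- lm(score ~ sleep_between + sleep_within, data = d)
coef(fit)
# sleep_between: "Do students who sleep more score better?"  (negative here)
# sleep_within:  "When each student sleeps more, do they score better?"  (positive here)
```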
The Research Question Determines The Model
The hierarchical model:
- Separates within-person from between-person effects
- Gives us the average causal effect we care about
- Accounts for individual differences
- Provides the right answer to “does sleep help?”
lm(y ~ x) - Between-person only (WRONG here)
lmer(y ~ x + (1 | id)) - Different baselines
lmer(y ~ x + (x | id)) - Different baselines + responses (BEST)

The moral of the story: When you have repeated measures, you MUST account for individual differences, or you might conclude the exact opposite of the truth!